Appl. No.: to be assigned
Amendments to the Claims:

This listing of claims will replace all prior versions, and listings, of claims in the application: 
Listing of Claims: 

1. (currently amended) A method of processing digitized textual information, the
information being organized in terms, documents and document corpora, where each document 
contains at least one term and each document corpus contains at least one document, the method 
comprising: 

generating a concept vector for each document in a document corpus, wherein the
concept vector conceptually classifies the contents of the document on a relatively compact
format, and

generating, for each term in the document corpus, a term-to-concept vector describing a 
relationship between the term and each of the concept vectors, wherein the
term-to-concept vectors are generated on basis of the concept vectors, and the method
comprises:

receiving the term-to-concept vectors for the document corpus and on basis thereof 
generating a term-term matrix describing a term-to-term relationship between the terms in the 
document corpus, and 

processing the term-term matrix into processed textual information. 

2. (currently amended) A method according to claim 1, wherein each
document in the document corpus is associated with a document-concept matrix representing
at least one concept element whose relevance with respect to the document is described by a
weight factor, and the generation of each term-to-concept vector comprises:

identifying a term-relevant set of documents in the document corpus, each document in 
the term-relevant set containing at least one occurrence of the term, 

calculating a term weight for the term in each of the documents in the term-relevant set,

retrieving a respective concept vector being associated with each document in the
term-relevant set where the term weight exceeds a first threshold value,



Preliminary Amdt dated July 15, 2004 

selecting a relevant set of concept vectors including any concept vector in which at least 
one concept component exceeds a second threshold value, 

calculating a non-normalized term-to-concept vector as the sum of all concept vectors in 
the relevant set, and 

normalizing the non-normalized term-to-concept vector. 
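
As a non-limiting illustration of the steps recited in claim 2, a minimal Python sketch might look as follows. The claim does not specify how the term weight is calculated or how the vector is normalized; the term-frequency weight, the normalization to unit sum, and all names (`term_to_concept_vector`, `documents`, `concept_vectors`) are assumptions made for the example only.

```python
from typing import Dict, List


def term_to_concept_vector(
    term: str,
    documents: Dict[str, List[str]],          # document id -> list of terms
    concept_vectors: Dict[str, List[float]],  # document id -> concept vector
    first_threshold: float,
    second_threshold: float,
) -> List[float]:
    dims = len(next(iter(concept_vectors.values())))
    total = [0.0] * dims
    for doc_id, tokens in documents.items():
        if term not in tokens:
            continue  # document is not in the term-relevant set
        # Assumed term weight: relative frequency of the term in the document.
        weight = tokens.count(term) / len(tokens)
        if weight <= first_threshold:
            continue
        vector = concept_vectors[doc_id]
        # Keep only concept vectors with a component above the second threshold.
        if max(vector) <= second_threshold:
            continue
        total = [t + v for t, v in zip(total, vector)]
    # Assumed normalization: scale the summed vector so its components sum to 1.
    norm = sum(total)
    return [t / norm for t in total] if norm > 0 else total
```

With both thresholds set to zero, every document containing the term contributes its concept vector to the sum.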

3. (currently amended) A method according to claim 1, wherein the generation of the
term-term matrix comprises:

retrieving, for each term in each combination of two unique terms in the document 
corpus, a respective term-to-concept vector, 

generating a relation vector describing the relationship between the terms in each 
combination of two unique terms, each component in the relation vector being equal to a lowest 
component value of corresponding component values in the term-to-concept vectors, 

generating a relationship value for each combination of two unique terms as the sum of 
all component values in the corresponding relation vector, and 

generating a matrix containing the relationship values of all combinations of two unique 
terms in the document corpus. 
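
By way of example only, the relation vector and relationship value of claim 3 could be computed as in the following Python sketch. The dictionary-based matrix representation and the names `term_term_matrix` and `t2c` are hypothetical choices for illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def term_term_matrix(t2c: Dict[str, List[float]]) -> Dict[Tuple[str, str], float]:
    """Map each combination of two unique terms to its relationship value."""
    matrix: Dict[Tuple[str, str], float] = {}
    for a, b in combinations(sorted(t2c), 2):
        # Relation vector: lowest of the corresponding component values.
        relation = [min(x, y) for x, y in zip(t2c[a], t2c[b])]
        # Relationship value: sum of all components of the relation vector.
        matrix[(a, b)] = sum(relation)
    return matrix
```

For term-to-concept vectors `[0.8, 0.2]` and `[0.5, 0.5]`, the relation vector is `[0.5, 0.2]` and the relationship value is approximately 0.7.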

4. (currently amended) A method according to claim 1, wherein the method further
comprises the steps of:

calculating a statistical co-occurrence value between each combination of two unique 
terms in the document corpus, the statistical co-occurrence value describing a dependent 
probability that a certain second term exists in a document provided that a certain first term 
exists in the document, and 

incorporating the statistical co-occurrence values into the term-term matrix to represent 
lexical relationships between the terms in the document corpus. 
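
The dependent probability of claim 4 may, for instance, be estimated from document counts as sketched below in Python. Estimating the probability as a ratio of document counts is an assumption, as is the name `cooccurrence_values`; how the values are incorporated into the term-term matrix is not specified by the claim and is left out of the sketch.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple


def cooccurrence_values(documents: List[Set[str]]) -> Dict[Tuple[str, str], float]:
    """Estimate P(second term in document | first term in document)."""
    terms = sorted(set().union(*documents))
    values: Dict[Tuple[str, str], float] = {}
    for first, second in permutations(terms, 2):
        with_first = [doc for doc in documents if first in doc]
        with_both = [doc for doc in with_first if second in doc]
        # Every term comes from some document, so with_first is never empty.
        values[(first, second)] = len(with_both) / len(with_first)
    return values
```

The value is generally asymmetric: a rare term may almost always appear together with a common term, while the converse does not hold.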

5. (currently amended) A method according to claim 1, wherein the method further
comprises the step of:

displaying the processed textual information on a format being adapted for human 
comprehension. 



6. (currently amended) A method according to claim 5, wherein the
displaying step further comprises presentation of:

at least one document identifier specifying a document being relevant with respect to at
least one term in a query,

at least one term being related to a term in a query, and 

a conceptual distribution representing a conceptual relationship between two or more 
terms in the document corpus, the conceptual distribution being based on shared concepts which 
are common to said terms. 

7. (currently amended) A method according to claim 5, wherein the
displaying step further comprises presentation of at least one document identifier
specifying a document being relevant with respect to at least one term in a query in combination
with at least one user specified concept.

8. (currently amended) A method according to claim 7, wherein the
method further comprises the step of: selecting the at least one user specified concept from the
shared concepts in the conceptual distribution.

9. (currently amended) A method according to any one of claims 5 to 8,
wherein the method further comprises the step of: illustrating the conceptual relationship
between a first term and at least one second term by means of a respective relevance measure
being associated with the at least one second term in respect of the first term.

10. (currently amended) A method according to claim 9, wherein the
method further comprises the step of: displaying the processed textual information on a 
graphical format which visualizes the strength in the conceptual relationship between at least two 
terms. 

11. (currently amended) A method according to claim 9 or 10,
wherein the method further comprises the step of:

displaying the processed textual information as a distance graph in which each term
constitutes a node, wherein a node representing a first term is connected to one or more other
nodes representing secondary terms to which the first term has a conceptual relationship of at
least a specific strength, and the relevance measure between the first term and the at least one
second term is represented by a minimum number of node hops between the first term and the at
least one second term.
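
A possible way of obtaining the minimum number of node hops recited in claim 11 is a breadth-first search over the distance graph, sketched here in Python. The undirected-graph representation and the names `hop_distances`, `relations` and `min_strength` are assumptions for illustration.

```python
from collections import deque
from typing import Dict, Set, Tuple


def hop_distances(
    relations: Dict[Tuple[str, str], float], start: str, min_strength: float
) -> Dict[str, int]:
    """Return the minimum number of node hops from start to each reachable term."""
    # Connect two terms only if their relationship has at least the given strength.
    adjacency: Dict[str, Set[str]] = {}
    for (a, b), strength in relations.items():
        if strength >= min_strength:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    hops = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search visits nodes in order of hop count
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops
```

Terms that cannot be reached through connections of sufficient strength are simply absent from the result.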

12. (currently amended) A method according to claim 9 or 10,
wherein the method further comprises the step of:

displaying the processed textual information as a distance graph in which each term
constitutes a node, wherein a node representing a first term is connected to one or more other
nodes representing secondary terms to which the first term has a conceptual relationship, each
connection is associated with an edge weight representing the strength of a conceptual
relationship between the first term and a particular secondary term, and the relevance measure
between the first term and a particular secondary term is represented by an accumulation of the
edge weights being associated with the connections constituting a minimum number of node hops
between the first term and the particular secondary term.
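
The accumulation of edge weights along a minimum-hop path, as recited in claim 12, might be sketched in Python as follows. The claim does not say whether the accumulation is a sum or a product, nor how ties between equally short paths are resolved; the sketch assumes a sum and takes the first path found when neighbours are visited in alphabetical order. The name `accumulated_relevance` is hypothetical.

```python
from collections import deque
from typing import Dict, Tuple


def accumulated_relevance(
    edges: Dict[Tuple[str, str], float], start: str
) -> Dict[str, Tuple[int, float]]:
    """Map each reachable term to (minimum hops, summed edge weights on that path)."""
    adjacency: Dict[str, Dict[str, float]] = {}
    for (a, b), weight in edges.items():
        adjacency.setdefault(a, {})[b] = weight
        adjacency.setdefault(b, {})[a] = weight
    result: Dict[str, Tuple[int, float]] = {start: (0, 0.0)}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        hops, total = result[node]
        # Sorted order makes the tie-breaking between equal-length paths deterministic.
        for neighbour in sorted(adjacency.get(node, {})):
            if neighbour not in result:
                result[neighbour] = (hops + 1, total + adjacency[node][neighbour])
                queue.append(neighbour)
    return result
```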

13. (currently amended) A method according to claim 1,
wherein each term further comprises:

a single word,
a proper name,
a phrase, and
a compound of single words.

14. (currently amended) A method according to claim 1, further comprising the step of
updating the document corpus with added data in
form of at least one new document by means of

identifying any added terms in the new document which lack a representation in the
document corpus,

identifying any existing terms in the new document which were represented in the
document corpus before adding the at least one new document,

retrieving, for each of the existing terms, a corresponding concept vector,

generating a new concept vector with respect to the at least one new document as a sum
of the corresponding concept vectors,

normalizing the new concept vector into a normalized new concept vector, and

assigning the normalized new concept vector to each of the added terms in the new
document.
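
Purely as an illustration of the updating steps of claim 14, a Python sketch could take the following form. It assumes that each existing term is counted once per new document and that normalization scales the vector to unit sum; neither is specified by the claim, and the names `vectors_for_added_terms` and `term_vectors` are hypothetical.

```python
from typing import Dict, List


def vectors_for_added_terms(
    new_document: List[str], term_vectors: Dict[str, List[float]]
) -> Dict[str, List[float]]:
    """Assign a normalized concept vector to every term not yet in the corpus."""
    unique_terms = list(dict.fromkeys(new_document))  # keep order, drop repeats
    added = [t for t in unique_terms if t not in term_vectors]
    existing = [t for t in unique_terms if t in term_vectors]
    if not existing:
        return {}  # no known terms to derive a concept vector from
    dims = len(term_vectors[existing[0]])
    # New concept vector: sum of the vectors of the existing terms.
    total = [sum(term_vectors[t][i] for t in existing) for i in range(dims)]
    # Assumed normalization: scale so that the components sum to 1.
    norm = sum(total)
    normalized = [v / norm for v in total] if norm > 0 else total
    return {t: list(normalized) for t in added}
```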

15. (currently amended) A computer program directly loadable into the internal memory of a
digital computer, comprising software for performing a method of processing digitized textual
information, the information being organized in terms, documents and document corpora, where
each document contains at least one term and each document corpus contains at least one
document, the method comprising:

generating a concept vector for each document in a document corpus, wherein the
concept vector conceptually classifies the contents of the document on a relatively compact
format,

generating, for each term in the document corpus, a term-to-concept vector describing a
relationship between the term and each of the concept vectors, wherein the term-to-concept
vectors are generated on basis of the concept vectors,

receiving the term-to-concept vectors for the document corpus and on basis thereof
generating a term-term matrix describing a term-to-term relationship between the terms in the
document corpus, and

processing the term-term matrix into processed textual information. 

16. (currently amended) A computer readable medium, having a program recorded thereon,
where the program is to make a computer perform a method
of processing digitized textual information, the information being organized in terms, documents
and document corpora, where each document contains at least one term and each document
corpus contains at least one document, the method comprising:

generating a concept vector for each document in a document corpus, wherein the
concept vector conceptually classifies the contents of the document on a relatively compact
format,

generating, for each term in the document corpus, a term-to-concept vector describing a
relationship between the term and each of the concept vectors, wherein the term-to-concept
vectors are generated on basis of the concept vectors,

receiving the term-to-concept vectors for the document corpus and on basis thereof
generating a term-term matrix describing a term-to-term relationship between the terms in the
document corpus, and

processing the term-term matrix into processed textual information. 

17. (currently amended) A search engine for processing an amount of digitized textual
information and extracting data therefrom, the information being organized in terms, documents
and document corpora, where each document contains at least one term and each document
corpus contains at least one document, comprising:

an interface adapted to receive a query from a user, and

a processing unit adapted to process a document corpus on basis of the query
and return processed textual information being relevant to the query, said process
involving

generating a concept vector for each document in the document corpus, the concept
vector conceptually classifying the contents of the document on a relatively compact format, and

generating, for each term in the document corpus, a term-to-concept vector describing a
relationship between the term and each of the concept vectors, wherein the
processing unit in turn comprises:

a processing module adapted to receive the term-to-concept vectors for the
document corpus and on basis thereof generate a term-term matrix describing a term-to-term
relationship between the terms in the document corpus, and

an exploring module adapted to receive the query and the term-term matrix,
and on basis of the query process the term-term matrix into the processed textual
information.

18. (currently amended) A database holding an amount of digitized textual information
being organized in terms, documents and document corpora, wherein each document contains at
least one term and each document corpus contains at least one document, wherein each
document in a document corpus is associated with a concept vector which conceptually
classifies the contents of the document on a relatively compact format, and wherein each term in
the document corpus is associated with a term-to-concept vector describing a relationship
between the term and each of the concept vectors, wherein it is adapted to
deliver the term-to-concept vectors to a search engine according to claim 17.

19. (currently amended) A database according to claim 18, further comprising an
iterative term-to-concept engine adapted to receive fresh
digitized textual information added to the database and on basis of this information:

generate concept vectors for any added document, and 

generate a term-to-concept vector describing a relationship between any added term and 
each of the concept vectors. 

20. (currently amended) A server for providing data processing services in respect of
digitized textual information, wherein the server comprises:

a search engine for processing an amount of digitized
textual information and extracting data therefrom, the information being organized in terms,
documents and document corpora, where each document contains at least one term and each
document corpus contains at least one document, comprising an interface adapted to receive a
query from a user, and a processing unit adapted to process a document corpus on basis of the
query and return processed textual information being relevant to the query, said process
involving generating a concept vector for each document in the document corpus, the concept
vector conceptually classifying the contents of the document on a relatively compact format, and
generating, for each term in the document corpus, a term-to-concept vector describing a
relationship between the term and each of the concept vectors, wherein the processing unit in
turn comprises a processing module adapted to receive the term-to-concept vectors for the
document corpus and on basis thereof generate a term-term matrix describing a term-to-term
relationship between the terms in the document corpus, and an exploring module adapted to
receive the query and the term-term matrix, and on basis of the query process the term-term
matrix into the processed textual information, and

a communication interface towards a database according to
claim 18 or 19.

21. (currently amended) A system for providing data processing services in respect of
digitized textual information, wherein the system comprises:

a server according to claim 20,

at least one user client adapted to communicate with the server, and

a communication link connecting the at least one user client with the
server.

22. (currently amended) A system according to claim 21, wherein

an internet accomplishes at least a part of the communication link, and

the at least one user client comprises a web browser which in turn provides:

a user input interface adapted to receive queries from a user and forward the
queries to the server via the communication link, and

a user output interface adapted to receive processed textual information from
the server via the communication link and present the processed textual information
to the user.

23. (currently amended) A method of processing digitized textual information, the
information being organized in terms, documents and document corpora, where each document
contains at least one term and each document corpus contains at least one document, the method
comprising:

identifying a particular document corpus,

filtering the identified document corpus wherein a number of documents fulfilling at least
one specified criterion are selected, and

producing a new document corpus exclusively containing the selected documents.

24. (currently amended) A method according to claim 23, wherein the
filtering comprises the further steps of:

identifying a number of document clusters in the identified document corpus by means of
a document clustering algorithm,

generating, for each identified document cluster, a representative document vector by
means of the document clustering algorithm, and

removing all non-clustered documents from the identified document corpus.

25. (currently amended) A method according to claim 23, wherein the
filtering comprises the further steps of:

receiving a user input specifying at least one of one or more concepts and one or more
terms,

selecting, from the identified document corpus, documents being related to at least one of
the concepts or the terms, and

removing all non-selected documents from the identified document corpus.

26. (currently amended) A method according to claim 23,
wherein the identified document corpus has been processed according to
a method comprising the following steps:

generating a concept vector for each document in a document corpus, wherein the concept
vector conceptually classifies the contents of the document on a relatively compact format, and

generating, for each term in the document corpus, a term-to-concept vector describing a
relationship between the term and each of the concept vectors, wherein the term-to-concept
vectors are generated on basis of the concept vectors,

receiving the term-to-concept vectors for the document corpus and on basis thereof
generating a term-term matrix describing a term-to-term relationship between the terms in the
document corpus, and

processing the term-term matrix into processed textual information. 